Abstract:

We construct three public key knapsack cryptosystems. Standard knapsack cryptosystems hide easy
instances of the knapsack problem and have been broken. The systems considered in this article address
this problem: they hide a random (possibly hard) instance of the knapsack problem. We provide both
complexity results (size of the key, time needed to encypher/decypher...) and experimental results.
Security results are given for the second cryptosystem (the fastest one and the one with the shortest
key). Probabilistic polynomial reductions show that finding the private key is as difficult as factorizing
a product of two primes. We also consider heuristic attacks. First, the density of the cryptosystem can
be chosen arbitrarily close to one, which rules out low density attacks. Finally, we consider explicit
heuristic attacks based on the LLL algorithm and we prove that with respect to these attacks, the public
key is as secure as a random key.

Introduction 

The principle 

It is natural to build cryptosystems relying on NP-complete problems since NP-complete problems are
presumably difficult to solve. There are several versions of knapsack problems, all of them being
NP-complete. Several cryptosystems relying on knapsack problems have been introduced in the eighties.

We are interested in the bounded version of the knapsack problem. Let $s, M, v, v_1, \dots, v_s \in \mathbb{N}$. The
problem is to determine whether there are integers $e_i$, $0 \leq e_i < M$ such that $\sum_{i=1}^{s} e_i v_i = v$. In case
$M = 2$, the problem is to fill a knapsack of volume $v$ with objects of volume $v_i$.

Knapsack cryptosystems are built on knapsack problems. Alice constructs integers $v_i$ (using some
private key $q$) such that the cyphering map $C$ is injective: $C : \{0, \dots, M-1\}^s \to \mathbb{N}$, $(e_i) \mapsto \sum e_i v_i$.
The sequence $v_i$ is the public key. When Bob has a plaintext message $m \in \{0, \dots, M-1\}^s$ for Alice,
he sends the ciphertext $C(m)$. Alice decodes using her private key.

Strength and weakness of knapsack cryptosystems 

The main advantage of knapsack cryptosystems is the speed. These systems attain very high encryption 
and decryption rates. The knapsack cryptosystem proposed by Merkle-Hellman [7] seemed to be 100 
times faster than RSA for the same level of security at the time it was introduced [S].

The main weakness of knapsack cryptosystems is security. All standard knapsack cryptosystems have
been broken: the Merkle-Hellman cryptosystem by Shamir and Adleman [11], the iterated
Merkle-Hellman by Brickell [3], the Chor-Rivest cryptosystem by Vaudenay in 1997 [12] ...

Two main reasons explain the fragility of knapsack cryptosystems. 

First, most of these cryptosystems start with an easy instance. The knapsack problem is
NP-complete and no fast algorithm to solve it is known in general. However, the knapsack problem is easy
to solve for some instances $(v_i)_{i \leq s}$: if $(v_i)$ is a superincreasing sequence in the sense that $v_i > \sum_{j<i} v_j$,
there is a very fast algorithm to solve the knapsack problem, depending linearly on the size of the
data. For knapsack cryptosystems, the public key is usually a hard instance $(v_i)$ obtained as a function
$v_i = f(q, w_i)$ of an easy instance $(w_i)$ using a private key $q$. When Alice receives the message $C_{v_i}(m)$
encrypted with the hard instance $v_i$, she can compute with her private key the message $C_{w_i}(m)$ encrypted
with the easy instance $w_i$. Then she decodes easily.
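
The fast algorithm for superincreasing sequences can be sketched as follows in the case $M = 2$. This is a minimal illustration, not code from the article; the function name `solve_superincreasing` and the sample weights are assumptions chosen for the example.

```python
def solve_superincreasing(weights, target):
    """Greedy solver for a 0/1 knapsack whose weights are superincreasing.

    `weights` must satisfy weights[i] > sum(weights[:i]) for every i.
    Returns the list of bits e with sum(e[i] * weights[i]) == target,
    or None when no such bits exist.
    """
    bits = [0] * len(weights)
    remaining = target
    # Scan from the largest weight down: a weight must be taken exactly when
    # it fits, because all smaller weights together sum to less than it.
    for i in range(len(weights) - 1, -1, -1):
        if remaining >= weights[i]:
            bits[i] = 1
            remaining -= weights[i]
    return bits if remaining == 0 else None


weights = [2, 3, 7, 15, 31]                 # hypothetical easy instance
print(solve_superincreasing(weights, 24))   # 24 = 2 + 7 + 15
```

The loop visits each weight once, which matches the linear dependence on the size of the data mentioned above.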

One could hope that if the private key q is chosen randomly, it is impossible to recover q and the 
message. This intuition is wrong. As an easy instance of the knapsack problem, the initial sequence $w_i$
carries information and this information is still present in the ciphertext in a hidden form. This makes
it possible to break the system. For instance, in the Merkle-Hellman scheme, $w_i$ is a superincreasing
sequence and Shamir has shown that it is possible to recover the initial message $m$, even if the private
key $q$ remains unknown.

Thus, starting from an easy instance and hiding it with a random private key is structurally weak. 
Information can leak, whatever the random choice of the private key. 

Another potential weakness of knapsack cryptosystems is the possibility of low density attacks. 

Usually the numbers $(v_i)_{i \leq s}$ used as the public key are large numbers and the density
$d = s / \max_i \log_2(v_i)$ is low. In this case, the elements $(e_i)$ of the translated lattice $L$ defined by the
equation $\sum e_i v_i = C(m)$ are expected to be large, and the plaintext message $m$ sent by Bob to Alice is expected
to be the smallest element in L. Besides this heuristic argument, this circle of ideas yields a provable 
reduction of the knapsack problem to the closest vector problem CVP ( CVP consists in finding the 
closest point to a fixed point P in a lattice). In particular, using polynomial time algorithms to approx- 
imate CVP Q], the knapsack problem is solvable in polynomial time when the density is low enough 
and the knapsack is sufficiently general : most knapsacks of density roughly less than 2/s are solvable 
in polynomial time [H] . 
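
As a small illustration of the density $d = s / \max_i \log_2(v_i)$ used in this discussion, the following sketch computes it for a list of weights. The function name `knapsack_density` is hypothetical, and the weights are assumed to be integers with at least one of them greater than 1.

```python
import math


def knapsack_density(v):
    """Density d = s / max_i log2(v_i) of the knapsack with weights v."""
    return len(v) / max(math.log2(x) for x in v)


# Ten weights of about 100 bits each give a low density of about 0.1.
print(knapsack_density([2 ** 100 + k for k in range(10)]))
```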

When the density is low but not less than 2/s, there is no known polynomial time algorithm to 
solve knapsack problems. However, one can still reduce knapsack problems to CVP. The embedding 
method reduces CVP to the shortest vector problem SVP with high probability when the density $d$ of
the knapsack is low enough, explicitly when $d < 0.9408...$ (SVP consists in finding the shortest vector
in a lattice). Although CVP is NP-hard and SVP is NP-hard under randomized reductions [8], there
are algorithms which efficiently solve CVP and SVP in low dimension, notably LLL-based algorithms.
In practical terms, a knapsack cryptosystem should have dimension s at least 300 to avoid such attacks. 

Aim of the article 

Summing up, Alice constructs a cryptosystem starting from an instance $(w_i)_{i \leq s}$ and hides it with a
private key $q$. The public key $v_i = v_i(q, w_i)$ is a function of $q$ and $w_i$. The above analysis shows that a
knapsack cryptosystem is potentially weak if one starts with an easy instance $(w_i)_{i \leq s}$. To construct a
robust cryptosystem, one should start with a hard instance $(w_i)_{i \leq s}$, i.e. the $w_i$'s should have no structure
(chosen randomly). The dimension $s$ should be at least 300. Under these conditions, breaking the
cryptosystem should be as difficult as recovering the private key q since the existence of the private key 
is the only reason which makes the message received by Alice decipherable. In particular, the difficulty 
to find the private key is expected to be a measure of the security of the system. 

The goal of this paper is to construct such cryptosystems which start with a random instance $(w_i)_{i \leq s}$
in high dimension s and such that finding the private key is as difficult as factorising a product of two 
primes. 

Unlike the other knapsack cryptosystems, our construction does not include modular multiplications. 

Differences and similarities between the three cryptosystems

The first of our three systems is the most natural. It is a fast system, both for encryption and decryption. 
The drawback is the size of the public key which goes from 0.1MB to 4.9MB depending on the level of 
security considered. 

The size of the public key is subject to debate. Some authors want a short key. Other authors (see 
[3]) think that the concept of a small key should be questioned, and that, in view of the transmission 
rates on the Internet today, it is preferable to have a fast and secure system than a system with a small 
public key. 

The sizes of the keys considered in the first system are large. Though they could be compatible with 
the transmission rates on the internet or the size of the memory of modern computers, it is nevertheless 
desirable to shorten the keys. We thus construct a second system based on the same ideas with a 
shorter key. The size of the key starts from 0.03MB for a reasonably secure system (corresponding to
a knapsack problem with $s = 500$ elements), and is around 0.1MB in dimension $s = 1000$.

Our third cryptosystem is a hybrid between the first two cryptosystems. The key is not much longer
than in the second cryptosystem, but the private key has been hidden more carefully and the system is 
more secure. 

Our three cryptosystems have in common the same underlying one-way function based on the
following remark: it is fast to produce divisions $n_i = q x_i + r_i$ with small remainders $r_i \ll q$ (choose
$q, x_i, r_i$ and compute $n_i$) but it takes more time to recover the divisions once the numbers $n_i$ are given.
For instance, if there is one number $n$ and we look for the smallest remainder $r = 0$ in a division $n = qx + r$,
it means that we try to find a factorisation of $n$. The security of the RSA system relies on the difficulty
to factorize a product of two primes $n = qx$. Thus our one-way function can be seen as a generalisation
of the one-way function used in the RSA system. Section 1.2 explains this one-way function in more
detail.

The results 

We provide complexity results, experimental results, and security results for the cryptosystems. 

Complexity results

There are various possible choices for the parameters. There are two base parameters $s, p$, with $s = o(p)$,
and the other parameters depend on $s$ and $p$. The complexity results for the first system are as follows,
where $\varepsilon$ is an arbitrarily small positive number.

Theorem 1.

Size of the public key $x_s$: $O(s^2 \log_2(p))$

Size of the private key $\epsilon, \sigma, \tau$: $O(s^2 \log_2(p))$

Encryption time: $O(s^2 \log_2(p))$

Decryption time: $O((s^2 \log_2(p))^{1+\varepsilon})$

Creation time of the public key: $O(s^3 \log_2(p)^{1+\varepsilon})$

Density of the knapsack associated with $x_s$: $1/\log_2(p)$.

The complexity results for the second system are the following: 

Theorem 2. Size of the public key $x_1$: $O(s^2 + s\log_2(p))$
Size of the private key: $O(s^2 + s\log_2(p))$
Encryption time: $O(s^2 + s\log_2(p))$
Decryption time: $O(s^2 + \log_2(p)^{1+\varepsilon})$
Time to create the public key: $O(s^2 + \log_2(p)^{1+\varepsilon})$
Density of the knapsack associated with $x_1$: $\frac{1}{1 + \frac{2 + 2\log_2(p)}{s}}$.

For the parameters chosen as in variant 2, we have:

Theorem 3. Size of the public key $x_1$: $O(s^2 \log_2(p))$

Size of the private key: $O(s^2 + s\log_2(p))$

Encryption time: $O(s^2 + s\log_2(p))$

Decryption time: $O(s^2 + \log_2(p)^{1+\varepsilon})$

Time needed to create the public key: $O(s^2 + s\log_2(p))$

Density of the knapsack associated with $x_1$: $\frac{1}{2 + \frac{2 + \log_2(p)}{s}}$.

By construction, the third system is a hybrid mixing the first and second system. For brevity, we 
have not included its complexity results which can be computed as for the previous two systems. 



Experimental results for the first system 



We report experiments to show that encryption/decryption time is acceptable in high dimension. The
processor used is an Intel Xeon at 2GHz. The programs have been written with the software Maple
(a slow high level language that natively manipulates arbitrarily large integers).

Experimental results for the second system

[Tables not recovered: decryption time in seconds; time for generating the key in seconds.]

Security results

We now come to the security analysis of the cryptosystems. Among the three cryptosystems described, 
it is easier to attack the second cryptosystem (shortest key, built to be fast, no special care to hide the 
private key). Thus we concentrate our analysis on this second system.

First, we observe from the above formulas that the density can be made as close to 1 as desired with a
suitable choice of the parameters. Thus the parameters can be chosen to avoid low density attacks.

We consider both exact cryptanalysis and heuristic attacks.

We show that finding the private key $q$ is as difficult as factorising a number $n$ which is a product
of two primes: if it is possible to find the private key $q$ in polynomial time, then for all $\eta > 0$, it is possible
to factorise $n = pq$ in polynomial time with a probability of success at least $1 - \eta$ (theorem 22).

In fact, our result is a little more precise. The private key q is an integer with suitable properties. 
One could use a "pseudo-key" $q'$, i.e. an integer with the same properties as $q$, to cryptanalyse the
system. Our result says that finding a pseudo-key $q'$ with the help of some extra information is as difficult
as factorising a product of primes (i.e. there is a polynomial probabilistic reduction as above). Moreover,
the system is more secure if $q$ is the only integer with the required properties. We give evidence in
section 4.1 that one can construct with high probability a cryptosystem with $q$ as the only pseudo-key.

The above results express that it is difficult to find a pseudo-key. But the cryptosystem could 
still be attacked by heuristic attacks. Since most heuristic attacks rely on the LLL-algorithm and 
its improvements, we consider the standard attack relying on the LLL-algorithm and the embedding 
method. 

NP-completeness and many experiments lead to the conclusion that the knapsack problem is not
solvable for a random instance $x_0 = (v_1, \dots, v_s)$ in high dimension $s$. The public key is not a random
instance $x_0$ but a slight deformation $x_1$ of $x_0$. A weakness appears if the heuristic attacks perform better
when the random $x_0$ is replaced by $x_1$.

Our result (theorem 29) says in substance that, if $x_0$ is very general, replacing $x_0$ by a suitable $x_1$
is not dangerous: both the number of steps to perform the algorithm and the probability of success
are unchanged. In other terms, with respect to LLL-attacks, the system is as secure if the message is
cyphered with $x_0$ or with a suitable $x_1$.

1 First system

1.1 Description of the system 

We denote by $M_{p \times q}(A)$ the set of $p \times q$ matrices with coefficients in the set $A$.

• List of parameters: $M, s \in \mathbb{N}$, $\epsilon \in M_{s \times s}(\mathbb{N})$, $p_1, \dots, p_s, q_1, \dots, q_s \in \mathbb{N}$, $x_0 \in M_{1 \times s}(\mathbb{N})$.

• Message to be transmitted: a column vector $m \in \{0, 1, \dots, M-1\}^s = M_{s \times 1}(\{0, \dots, M-1\})$.

• Private key:

• An invertible matrix $\epsilon \in M_{s \times s}(\mathbb{N})$ with rows $\epsilon_1, \dots, \epsilon_s$. We let $\|\epsilon_i\|_1 = \sum_{j=1}^{s} \epsilon_{ij}$ the norm of
the $i$-th row.

• An $s$-tuple of positive rational numbers $\lambda_i = \frac{p_i}{q_i}$, $i = 1, \dots, s$ such that $(M-1)\lambda_i\|\epsilon_i\|_1 < 1$.

• Recursive construction: Choose a random row vector $x_0 \in \mathbb{N}^s$. Define the row vectors $x_i$,
$i = 1, \dots, s$ by $x_i = q_i x_{i-1} + p_i \epsilon_i$.

• Public key: $x_s$

• Cyphered message: $x_s m \in \mathbb{N}$.

Notation 4. We denote by $C$ the cyphering function $\{0, 1, \dots, M-1\}^s \to \mathbb{N}$, $m \mapsto N_s = x_s m$.

Proposition 5. The function $C$ is injective.

It suffices to explain how to decypher to prove the proposition. We define $N_i$, $0 \leq i \leq s$ and $O_i$,
$1 \leq i \leq s$ by decreasing induction:

• $N_s = C(m) = x_s m$

• $N_{i-1} = [\frac{N_i}{q_i}]$, where $[.]$ denotes the integer part

• $O_i = (N_i - q_i N_{i-1})/p_i$.

• Let $N \in M_{(s+1) \times 1}(\mathbb{N})$ be the column vector with entries $N_0, \dots, N_s$.

• Let $O \in M_{s \times 1}(\mathbb{Q})$ be the column vector with entries $O_1, \dots, O_s$.

• Let $X \in M_{(s+1) \times s}(\mathbb{N})$ be the matrix with rows $x_0, \dots, x_s$.

Proposition 6. The message $m$ verifies $Xm = N$, $\epsilon m = O$. In particular, the coefficients of $O$ are
integers.

Proof. We prove that $x_i m = N_i$ by decreasing induction on $i$. The case $i = s$ is true by definition.
If $x_i m = N_i$, then $(x_{i-1} + \lambda_i \epsilon_i) m = N_i/q_i$. Since $x_{i-1} m \in \mathbb{N}$ and $0 \leq \lambda_i \epsilon_i m \leq \lambda_i \|\epsilon_i\|_1 (M-1) < 1$
by hypothesis, we obtain $x_{i-1} m = [N_i/q_i] = N_{i-1}$, as expected. Thus $\epsilon_i m = (x_i - q_i x_{i-1}) m / p_i =
(N_i - q_i N_{i-1})/p_i = O_i$. ∎

Corollary 7. To decypher the message,

• Compute $N_{s-1}, \dots, N_0$ with the formula $N_{i-1} = [\frac{N_i}{q_i}]$.

• Compute $O_i = (N_i - q_i N_{i-1})/p_i$.

• Solve the system $\epsilon m = O$.
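
The construction and the decyphering steps of Corollary 7 can be sketched as follows with toy parameters. This is an illustration under stated assumptions, not the authors' Maple implementation: the names `keygen`, `encrypt` and `decrypt` are hypothetical, $M = 2$ and $p_i = 1$, and the matrix $\epsilon$ is taken upper triangular with unit diagonal only so that the linear system can be solved by back substitution (section 1.2 explains why such a structured matrix should not be used in practice).

```python
import random


def keygen(s, p, M=2, seed=0):
    """Toy key generation for the first system (illustrative parameters only)."""
    rng = random.Random(seed)
    # Assumption: eps is upper triangular with unit diagonal, so that
    # eps.m = O can be solved exactly by back substitution.
    eps = [[1 if j == i else (rng.randint(1, 3) if j > i else 0)
            for j in range(s)] for i in range(s)]
    ps = [1] * s
    qs = [rng.randint(p + 1, 2 * p) for _ in range(s)]
    for i in range(s):
        # Condition (M - 1) * lambda_i * ||eps_i||_1 < 1 with lambda_i = p_i / q_i.
        assert (M - 1) * ps[i] * sum(eps[i]) < qs[i]
    x = [rng.randint(0, 2 ** s) for _ in range(s)]        # x_0
    for i in range(s):                                    # x_i = q_i x_{i-1} + p_i eps_i
        x = [qs[i] * xj + ps[i] * e for xj, e in zip(x, eps[i])]
    return x, (eps, ps, qs)


def encrypt(public, m):
    """Cyphered message x_s . m."""
    return sum(xj * mj for xj, mj in zip(public, m))


def decrypt(private, c):
    eps, ps, qs = private
    s = len(qs)
    O = [0] * s
    N = c                                                 # N_s
    for i in range(s - 1, -1, -1):
        N_prev = N // qs[i]                               # N_{i-1} = [N_i / q_i]
        O[i] = (N - qs[i] * N_prev) // ps[i]              # O_i
        N = N_prev
    m = [0] * s
    for i in range(s - 1, -1, -1):                        # back substitution
        m[i] = O[i] - sum(eps[i][j] * m[j] for j in range(i + 1, s))
    return m


public, private = keygen(6, 100)
message = [1, 0, 1, 1, 0, 1]
print(decrypt(private, encrypt(public, message)) == message)
```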

1.2 Analysis of the system



The underlying one way function 

We make a quick analysis of the system. 

The couple $(q_s, \epsilon_s)$ in the private key satisfies $x_s = q_s x_{s-1} + p_s \epsilon_s$ with $q_s > p_s \|\epsilon_s\|_1 (M-1)$.
Componentwise, $p_s \epsilon_{si}$ is the remainder of the division of $x_{si}$ by $q_s$. These remainders are small. The
remainder of the division of $x_{si}$ by $q_s$ is at most $q_s$, and the sum of the remainders $p_s \epsilon_{si}$ for $1 \leq i \leq s$ is at
most $s q_s$ in general. In the present situation, the sum $\sum_i p_s \epsilon_{si} = p_s \|\epsilon_s\|_1$ of all the remainders is at most $\frac{q_s}{M-1}$.

In other words, an eavesdropper who tries to break the system looks for an integer $q_s$ such that the
remainders of the divisions of the $x_{si}$ by $q_s$ are unusually small: the sum of the $s$ remainders is at most $\frac{q_s}{M-1}$.

There is hopefully a one-way function here. It is easy to construct a couple of integers $(x, q)$ such
that the remainder of the division of $x$ by $q$ is small. But once $x$ is given, it is not easy to recover an integer
$q$ such that the remainder of the division of $x$ by $q$ is small.

For instance, to obtain a remainder which is at most $10^{-n}$ times the divisor $q$, choose any $y, q \in \mathbb{N}$,
$0 \leq e \leq q/10^n$ and put $x = qy + e$. As a function of $q$, the number of operations to compute $x$ is $O(\log_2(q))$.
If $x$ is given and Eve knows that there is a $q$ satisfying $x = qy + e$, $10^n e \leq q$, trying successively all possible
divisors $1, \dots, q$ requires $O(q)$ operations.

Thus, in the absence of a quick algorithm to find q, there is a gain of an exponential factor here. In 
our choice of parameters, the numbers $q_i$ will be large to make the most of this advantage.



Construction of the matrix e 

The matrix $\epsilon$ of the private key should be quickly invertible, for instance triangular, to facilitate
decryption (see corollary 7). But a triangular matrix $\epsilon$, or any matrix with a lot of null coefficients, would be a
bad choice. Indeed, if $\epsilon$ is sparse, there are two components $c, c'$ of $x_s = q_s x_{s-1} + p_s \epsilon_s = (\dots, c, c', \dots)$
whose gcd is a multiple of $q_s$, or $q_s$ itself. After several attempts, the eavesdropper could find $q_s$.

The same problem occurs if the components of $\epsilon_s$ are too small or well localised by a probability distribution.
If $x_s = (\dots, c, \dots, c', \dots)$, there is a natural attempt to find $q_s$: test for the gcd of $(c - e', c' - e'')$ for
several values of $e', e''$.
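
The gcd test described above can be sketched as follows. This is a toy illustration: the function name `gcd_candidates`, the bound `max_rest` on the guessed remainders, the threshold `min_size` and the numerical values are all assumptions made for the example.

```python
from itertools import combinations
from math import gcd


def gcd_candidates(x, max_rest, min_size):
    """Collect the large values of gcd(c - e1, d - e2) over pairs (c, d) of x.

    e1 and e2 range over the guessed remainders 0..max_rest. A correct guess
    for both remainders yields a multiple of the hidden divisor.
    """
    found = set()
    for c, d in combinations(x, 2):
        for e1 in range(max_rest + 1):
            for e2 in range(max_rest + 1):
                g = gcd(c - e1, d - e2)
                if g >= min_size:
                    found.add(g)
    return found


# Hidden divisor q = 1009 with small remainders (2, 0, 1).
q = 1009
x = [q * 12 + 2, q * 35 + 0, q * 18 + 1]
print(q in gcd_candidates(x, max_rest=2, min_size=100))
```

With remainders this small, only a handful of guesses are needed, which is why the coefficients of $\epsilon$ should be hard to localise.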

Summing up, the matrix e should satisfy the two following conditions: 

• its coefficients are difficult to localize,

• solving em = O is fast. 

If the coefficients of the matrix e are chosen randomly, it takes time to solve em = O. If we choose a 
lower triangular matrix L, an upper triangular matrix U with random uniform coefficients, and choose 
e = LU, then it is easy to solve the system but the coefficients of e are not random uniform and this 
non uniformity could be used to cryptanalyse the system as explained above. 

Thus there is a compromise to find between the amount of time required to compute and invert e 
and the uniformity in the coefficients of e. Our approach to find the compromise is to consider an upper 
triangular matrix $U$ with random coefficients and to deform it using elementary operations (proposition
8).

Let $L, N \in M_{s \times s}(\mathbb{N})$ be the lower triangular matrices defined by $L_{ii} = N_{ii} = 1$, $L_{i,1} = 1$, $N_{s,i} = 1$
and all other coefficients equal to zero. If $\sigma$ is a permutation of $\{1, \dots, s\}$, we denote by $M_\sigma$ the
permutation matrix defined by $(M_\sigma)_{i,\sigma(i)} = 1$ and $(M_\sigma)_{ij} = 0$ otherwise.

Proposition 8. Let $U \in M_{s \times s}(\mathbb{N})$ be an upper invertible triangular matrix with coefficients $U_{ij}$, $i \leq j$
chosen randomly in $\{1, \dots, x\}$ and $\sigma, \tau$ be permutations of $\{1, \dots, s\}$. Then every entry $e$ of the matrix
$\epsilon(s, x) = M_\sigma L U N M_\tau$ verifies $0 \leq e \leq 4x$. In particular, the norm of the lines $\epsilon_i$ satisfy $\|\epsilon_i\|_1 \leq 4sx$.

Proof. The permutations $\sigma, \tau$ permute the coefficients of $LUN$ so one can suppose
$\sigma = \tau = \mathrm{Identity}$. An entry in $U$ is in $\{0, \dots, x\}$. The left multiplication with $L$ replaces a line $L_i$, $i > 1$
with $L_i + L_1$. The right multiplication with $N$ replaces a column $C_i$, $i < s$ with $C_i + C_s$. Thus an entry
of $LUN$ is in $\{0, \dots, 4x\}$. ∎
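
The construction of Proposition 8 can be checked numerically with the following sketch. It uses plain Python lists as matrices; the names `matmul`, `perm_matrix` and `build_eps` are hypothetical, and indices start at 0 instead of 1.

```python
import random


def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def perm_matrix(sigma):
    """Permutation matrix with a 1 in position (i, sigma[i])."""
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for j in range(n)] for i in range(n)]


def build_eps(s, x, seed=0):
    """Return eps(s, x) = M_sigma L U N M_tau for random U, sigma, tau."""
    rng = random.Random(seed)
    # Upper triangular U with entries in {1, ..., x} on and above the diagonal.
    U = [[rng.randint(1, x) if i <= j else 0 for j in range(s)] for i in range(s)]
    # L: ones on the diagonal and in the first column.
    L = [[1 if (i == j or j == 0) else 0 for j in range(s)] for i in range(s)]
    # N: ones on the diagonal and in the last row.
    N = [[1 if (i == j or i == s - 1) else 0 for j in range(s)] for i in range(s)]
    sigma = list(range(s))
    tau = list(range(s))
    rng.shuffle(sigma)
    rng.shuffle(tau)
    LUN = matmul(matmul(L, U), N)
    return matmul(matmul(perm_matrix(sigma), LUN), perm_matrix(tau))


eps = build_eps(5, 10)
print(all(0 <= e <= 4 * 10 for row in eps for e in row))
```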



1.3 Suggested choice for the parameters 

In this section, suggestions for our list of parameters $M, s \in \mathbb{N}$, $\epsilon \in M_{s \times s}(\mathbb{N})$, $p_1, \dots, p_s, q_1, \dots, q_s \in \mathbb{N}$,
$x_0 \in M_{1 \times s}(\mathbb{N})$ are given. We fix two integers $s, p$ as base parameters. The other parameters are
constant or functions of $s$ and $p$.

The level of security depends on the size of $s$ and $p$. To give an idea of the size of the numbers
involved, $s > 300$ and $p > 10^6$ are sensible choices.

Suggested choice for the parameters as constants or functions of s,p: 

• M = 2 

• $\epsilon = \epsilon(s, [p/4s])$ is the random matrix considered in proposition 8

• $p_i = 1$, $q_i$ chosen randomly in $[p+1, 2p]$ (uniform law)

• $x_0$ has entries chosen randomly in $[0, 2^s]$ (uniform law)

Comments on the choices. 

The choice $M = 2$ is to make the system as simple as possible. Moreover, Shamir has shown that
compact knapsack cryptosystems (i.e. those with messages in $\{0, \dots, M-1\}^s$ and small $M$) tend to be
more secure [10].

The reason for the choice of the matrix $\epsilon$ has been given before proposition 8 (compromise between
randomness and invertibility). Note that the required condition $(M-1)\|\epsilon_i\|_1 \lambda_i < 1$ is satisfied by
proposition 8.

As to the choice of $\lambda_i = \frac{p_i}{q_i}$, we have explained that $q_i$ is large to make the most of the one-way
function. Looking at the recursive definition of $x_i$, it appears that the $x_i$'s are large when $p_i$ is large.
Thus we take $p_i = 1$ to limit the size of the key.

The entries of the initial vector $x_0$ are chosen randomly in $[0, 2^s]$ so that the density of the knapsack
cryptosystem associated to $x_0$ is expected close to one. If the density is lower, there could be a low
density attack on $x_0$, and maybe an attack on $x_s$ as $x_s$ is a modification of $x_0$. On the other hand, it is
not clear that a higher density is dangerous. It could even be a better choice. Experiments are needed
to decide. Thus we propose a variant of higher density:

Variant for the choice of parameters 

• $x_0$ has entries chosen randomly in $\{0, \dots, s^5\}$.

• All other parameters are chosen as before. 

1.4 Complexity results 

The complexity of the cryptosystem is described in the following theorem, using the first variant for the
choice of parameters (i.e. $x_0$ has entries in $\{0, \dots, 2^s\}$).

We denote by size(A) the number of bits needed to store an element A and by time(A) the number
of elementary operations needed to compute A. Recall that, for all $\varepsilon > 0$, computing a multiplication
of two integers $p$ and $q$ takes $time(pq) = O((size(p) + size(q))^{1+\varepsilon})$ elementary operations [5]. Moreover,
the complexity of a division is the same as the complexity of a multiplication.

Theorem 9. Suppose that $s = o(p)$. Then:

Size of the public key $x_s$: $O(s^2 \log_2(p))$

Size of the private key $\epsilon, q_i, \sigma, \tau$: $O(s^2 \log_2(p))$

Encryption time: $O(s^2 \log_2(p))$

Decryption time: $O((s^2 \log_2(p))^{1+\varepsilon})$

Creation time of the public key: $O(s^3 \log_2(p)^{1+\varepsilon})$

Density of the knapsack associated with $x_s$: $1/\log_2(p)$.

Proof.

• $\|\epsilon\|_\infty \leq p$

• $size(\|\epsilon\|_\infty) = O(\log_2(p))$

• $size(\epsilon_i) \leq s \cdot size(\|\epsilon\|_\infty) = O(s \log_2(p))$

• $size(\epsilon) = \sum_i size(\epsilon_i) = O(s^2 \log_2(p))$

• $size(q_1, \dots, q_s) = O(s \log_2(p))$

• $size(\sigma) = size(\tau) = time(\sigma) = time(\tau) = O(s \log_2(s))$

• $size(\text{private key}) = size(\epsilon, q_1, \dots, q_s, \sigma, \tau) = O(s^2 \log_2(p))$

• $\|x_i = q_i x_{i-1} + \epsilon_i\|_\infty \leq |q_i| \|x_{i-1}\|_\infty + \|\epsilon_i\|_\infty \leq 2p\|x_{i-1}\|_\infty + p$ thus $\|x_i\|_\infty \leq 3^i p^i \|x_0\|_\infty$.

• $size(\|x_i\|_\infty) = O(i \log_2(p) + size(\|x_0\|_\infty)) = O(i \log_2(p) + s)$

• $size(x_i) \leq s \cdot size(\|x_i\|_\infty) = O(i s \log_2(p) + s^2)$

• $size(\text{public key}) = size(x_s) = O(s^2 \log_2(p))$

• encryption time $= size(\text{public key}) = O(s^2 \log_2(p))$

• $time(x_i) = O(size(q_i)^{1+\varepsilon} + size(x_{i-1})^{1+\varepsilon} + size(\epsilon_i)) = O(size(x_{i-1})^{1+\varepsilon}) = O((i s \log_2(p) + s^2)^{1+\varepsilon}) \leq O((s^2 \log_2(p))^{1+\varepsilon})$

• $time(\text{public key}) = \sum_i time(x_i) = O((s^3 \log_2(p))^{1+\varepsilon})$

• $time(N_i = [N_{i+1}/q_{i+1}]) = O(size(q_{i+1})^{1+\varepsilon} + size(N_{i+1})^{1+\varepsilon}) = O(\log_2(p)^{1+\varepsilon} + size(x_{i+1} m)^{1+\varepsilon}) \leq O(\log_2(p)^{1+\varepsilon} + size(s\|x_{i+1}\|_\infty)^{1+\varepsilon}) = O((i \log_2(p) + s)^{1+\varepsilon}) \leq O((s \log_2(p))^{1+\varepsilon})$

• $time(N_0, \dots, N_s) = O((\log_2(p) s^2)^{1+\varepsilon})$

• $time(O_i = (N_i - q_i N_{i-1})/p_i) = O(time(N_i))$

• $time(N_0, \dots, N_s, O_1, \dots, O_s) = time(N_0, \dots, N_s) = O((\log_2(p) s^2)^{1+\varepsilon})$

To solve the linear system $\epsilon m = O$ with $\epsilon = M_\sigma L U N M_\tau$, we first suppose that $\epsilon = U$ (i.e. $M_\sigma =
L = N = M_\tau = \mathrm{Id}$). The entries $e$ in $\epsilon$ and $O$ satisfy $size(e) = O(\log_2(p))$. Since $\epsilon = U$
is triangular, solving the system takes a time $O((s^2 \log_2(p))^{1+\varepsilon})$. We have time(decryption) $=
time(N_1, \dots, N_s, O_1, \dots, O_s, \text{solving}(\epsilon m = O))$, thus the decryption takes $O((s^2 \log_2(p))^{1+\varepsilon})$ operations.
Since inverting $M_\sigma, L, N, M_\tau$ requires $O(s^2)$ operations, replacing $\epsilon = U$ by $\epsilon = M_\sigma L U N M_\tau$ does not
change the complexity. ∎



Remark 10. These theoretical results are consistent with the experimental results of the
introduction.



2 Second system 

2.1 Description of the system 

Since the size of the key is a bit large, we propose a second system to reduce the size of the key. 
The implicit one way function is the same as before. We only change the private key and take a 
superincreasing sequence instead of an invertible matrix. 

• List of parameters: $M, s \in \mathbb{N}$, $\epsilon \in \mathbb{N}^s$, $p_1, q_1 \in \mathbb{N}$, $x_0 \in M_{1 \times s}(\mathbb{N})$, a permutation $\sigma$ of $\{1, \dots, s\}$

• Message to be transmitted: a column vector $m \in \{0, 1, \dots, M-1\}^s$.

• Private key:

• A permutation $\sigma$ of $\{1, \dots, s\}$

• A row matrix $\epsilon \in M_{1 \times s}(\mathbb{N})$ such that the sequence $\epsilon_{\sigma(1)}, \dots, \epsilon_{\sigma(s)}$ is a superincreasing
sequence.

• A positive rational number $\lambda_1 = \frac{p_1}{q_1}$ such that $(M-1)\lambda_1\|\epsilon\|_1 < 1$.

• Construction: Choose a random row vector $x_0 \in \mathbb{N}^s$. Define the row vector $x_1$ by $x_1 = q_1 x_0 + p_1 \epsilon$.

• Public key: $x_1$

• Cyphered message: $x_1 m \in \mathbb{N}$.

Notation 11. We denote by $C$ the cyphering function $\{0, 1, \dots, M-1\}^s \to \mathbb{N}$, $m \mapsto C(m) = x_1 m$.

Proposition 12. The function $C$ is injective.

It suffices to explain how to decypher to prove the proposition. We define $N_1, N_0$ and $O$ as follows:

• $N_1 = C(m) = x_1 m$

• $N_0 = [\frac{N_1}{q_1}]$

• $O = (N_1 - q_1 N_0)/p_1$.

• Let $N$ be the column vector with entries $N_0, N_1$.

• Let $X$ be the matrix with rows $x_0, x_1$.

The same proof as for proposition 6 shows:

Proposition 13. The initial message $m$ verifies $Xm = N$, $\epsilon m = O$.

Now, since $\epsilon_{\sigma(i)}$ is a superincreasing sequence, the map $m \mapsto \epsilon m$ is injective and the formula to
decypher $m$ expresses $m_{\sigma(i)}$ by decreasing induction on $i \leq s$.

Proposition 14. • $m_{\sigma(s)} = 1$ if $O \geq \epsilon_{\sigma(s)}$ and $m_{\sigma(s)} = 0$ otherwise
• $m_{\sigma(i)} = 1$ if $O - \sum_{j>i} \epsilon_{\sigma(j)} m_{\sigma(j)} \geq \epsilon_{\sigma(i)}$ and $m_{\sigma(i)} = 0$ otherwise.
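
The second system can be sketched end to end as follows, with $M = 2$ and $p_1 = 1$. This is a toy illustration with hypothetical names (`keygen`, `encrypt`, `decrypt`), not the authors' implementation. The superincreasing sequence is drawn from intervals $[(2^{i-1}-1)p, 2^{i-1}p[$, with one deviation stated as an assumption: the smallest element is forced to be at least 1, since a zero weight would make the greedy decoding ambiguous.

```python
import random


def keygen(s, p, seed=0):
    """Toy key generation for the second system (M = 2, p_1 = 1)."""
    rng = random.Random(seed)
    # Superincreasing values, the i-th one in [(2^i - 1) p, 2^i p).
    # Assumption: the first value is at least 1.
    increasing = [rng.randrange(max(1, (2 ** i - 1) * p), 2 ** i * p)
                  for i in range(s)]
    sigma = list(range(s))
    rng.shuffle(sigma)                       # sigma[rank] = position in eps
    eps = [0] * s
    for rank, pos in enumerate(sigma):
        eps[pos] = increasing[rank]
    q = rng.randint(2 ** s * p, 2 ** (s + 1) * p)   # q_1 > sum(eps)
    x0 = [rng.randint(0, p) for _ in range(s)]
    x1 = [q * v + e for v, e in zip(x0, eps)]       # x_1 = q_1 x_0 + eps
    return x1, (q, eps, sigma)


def encrypt(x1, m):
    """Cyphered message x_1 . m."""
    return sum(w * b for w, b in zip(x1, m))


def decrypt(private, c):
    q, eps, sigma = private
    rest = c - q * (c // q)                  # O = N_1 - q_1 N_0 = eps . m
    m = [0] * len(eps)
    for pos in reversed(sigma):              # largest superincreasing term first
        if rest >= eps[pos]:
            m[pos] = 1
            rest -= eps[pos]
    return m


public, private = keygen(8, 50)
message = [1, 0, 0, 1, 1, 0, 1, 0]
print(decrypt(private, encrypt(public, message)) == message)
```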

2.2 Suggestion for the choice of the parameters 

The parameters s and p depend on the required level of security and the other parameters are constant 
or functions of s and p. 
Variant 1. Choose: 

• $\epsilon_{\sigma(1)} \in [0, p[$, $\epsilon_{\sigma(2)} \in [p, 2p[$, ..., $\epsilon_{\sigma(s)} \in [(2^{s-1} - 1)p, 2^{s-1}p[$ (uniform law)

• $x_0$ in $[0, p]$ (uniform law)

• $p_1 = 1$, $M = 2$

• $q_1 \in [2^s p, 2^{s+1} p]$ (uniform law)

Variant 2. Choose

• $x_0$ in $[0, 2^s]$ (uniform law)

• the other parameters as above. 

2.3 Complexity results 

As before, we suppose that the parameters $s$ and $p$ satisfy $s = o(p)$. For the parameters chosen as in
variant 1, we have: 

Theorem 15. Size of the public key $x_1$: $O(s^2 + s\log_2(p))$

Size of the private key: $O(s^2 + s\log_2(p))$

Encryption time: $O(s^2 + s\log_2(p))$

Decryption time: $O(s^2 + \log_2(p)^{1+\varepsilon})$

Time to create the public key: $O(s^2 + \log_2(p)^{1+\varepsilon})$

Density of the knapsack associated with $x_1$: $\frac{1}{1 + \frac{2 + 2\log_2(p)}{s}}$.

For the parameters chosen as in variant 2, we have:

Theorem 16. Size of the public key $x_1$: $O(s^2 \log_2(p))$

Size of the private key: $O(s^2 + s\log_2(p))$

Encryption time: $O(s^2 + s\log_2(p))$

Decryption time: $O(s^2 + \log_2(p)^{1+\varepsilon})$

Time needed to create the public key: $O(s^2 + s\log_2(p))$

Density of the knapsack associated with $x_1$: $\frac{1}{2 + \frac{2 + \log_2(p)}{s}}$.



For brevity, we include the proof only for variant 1.

Proof (for variant 1).

• $\|x_1 = q_1 x_0 + \epsilon\|_\infty \leq 2^{s+1} p \|x_0\|_\infty + \|\epsilon\|_\infty \leq 2^{s+1} p^2 + 2^{s-1} p \leq 2^{s+2} p^2$

• $size(\text{public key}) = size(x_1) \leq s \cdot size(\|x_1\|_\infty) = O(s^2 + s\log_2(p))$.

• $size(\epsilon) \leq s\log_2(p) + 1 + 2 + \dots + (s-1) = O(s^2 + s\log_2(p))$.

• $size(q_1) = O(s + \log_2(p))$

• $size(x_0) = O(\log_2(p))$

• $size(\sigma) = O(s\log_2(s))$

• $size(\text{private key}) = size(x_0, q_1, \epsilon, \sigma) = O(s^2 + s\log_2(p))$.

• encryption time $= size(\text{public key}) = O(s^2 + s\log_2(p))$

• $size(N_1) \leq \log_2(s\|x_1\|_\infty) = O(s + \log_2(p))$.

• $time(N_0) \leq O(size(N_1)^{1+\varepsilon} + size(q_1)^{1+\varepsilon}) = O(s^{1+\varepsilon} + \log_2(p)^{1+\varepsilon})$

• $size(N_0) = O(\log_2(s) + \log_2(p))$.

• $time(O) = O(size(N_1) + size(q_1)^{1+\varepsilon} + size(N_0)^{1+\varepsilon}) = O(s^{1+\varepsilon} + \log_2(p)^{1+\varepsilon})$ since $s < p$.

• $O - \sum_{j>i} \epsilon_{\sigma(j)} m_{\sigma(j)} \leq \sum_{j \leq i} \epsilon_{\sigma(j)} \leq p + 2p + \dots + 2^{i-1} p \leq 2^i p$.

• $time(m_{\sigma(i)} \text{ in proposition 14}) = size(O - \sum_{j>i} \epsilon_{\sigma(j)} m_{\sigma(j)}) = O(i + \log_2(p))$

• $time(m) = \sum_{i=1}^{s} time(m_{\sigma(i)}) = O(s\log_2(p) + 1 + 2 + \dots + s) = O(s\log_2(p) + s^2)$.

• decryption time $= time(N_0, O, m) = O(s^2 + \log_2(p)^{1+\varepsilon})$.

• $time(\text{public key}) = time(q_1 x_0 + \epsilon) = O(time(\epsilon) + time(q_1) + time(x_0) + size(q_1)^{1+\varepsilon} + size(x_0)^{1+\varepsilon} +
size(\epsilon)) = O(size(q_1)^{1+\varepsilon} + size(x_0)^{1+\varepsilon} + size(\epsilon))$ since $time(\epsilon) = O(size(\epsilon))$ and similarly for $q_1$
and $x_0$. Thus $time(\text{public key}) = O(s^2 + \log_2(p)^{1+\varepsilon})$

• $density(\text{knapsack}) = \frac{s}{\log_2(\|x_1\|_\infty)} \geq \frac{s}{s + 2 + 2\log_2(p)} = \frac{1}{1 + \frac{2 + 2\log_2(p)}{s}}$. ∎



3 Third system 

Two cryptosystems have been constructed so far. In the second system, the key is shorter than in the 
first one, but the system could be less secure because of the superincreasing sequence. 

This section presents a hybrid system, a compromise between the two previous systems. We still 
use a superincreasing sequence to shorten the key as in the second system, but the matrix e has several 
lines as in the first system to hide more carefully the superincreasing sequence. Hopefully, this is a good 
compromise between security and length of the key. 

• List of parameters: $M, s \in \mathbb{N}$, $\epsilon \in M_{2 \times s}(\mathbb{N})$, $p_1, q_1, p_2, q_2 \in \mathbb{N}$, $x_0 \in M_{1 \times s}(\mathbb{N})$, $\sigma$ a permutation of
$\{1, \dots, s\}$

• Message to be transmitted: a column vector $m \in \{0, 1, \dots, M-1\}^s$.

• Private key:

• A permutation $\sigma$ of $\{1, \dots, s\}$

• An invertible $2 \times s$ matrix $\epsilon$ with entries in $\mathbb{N}$ such that the row $\mu = \epsilon_2 - \epsilon_1$ is a superincreasing
sequence with respect to the permutation $\sigma$, i.e. $\mu_{\sigma(1)}, \dots, \mu_{\sigma(s)}$ is a superincreasing sequence.

• Two positive rational numbers $\lambda_i = \frac{p_i}{q_i}$ such that $(M-1)\lambda_i\|\epsilon_i\|_1 < 1$.

• Construction: Choose a random row vector $x_0 \in \mathbb{N}^s$. Define the row vectors $x_1, x_2$ by
$x_1 = q_1 x_0 + p_1 \epsilon_1$, $x_2 = q_2 x_1 + p_2 \epsilon_2$.

• Public key: $x_2$

• Cyphered message: $N_2 = x_2 m \in \mathbb{N}$.

To decypher, we define $N_1, N_0$ and $O_2, O_1$ as before, and $u = O_2 - O_1$:

• Compute $N_1$ and $N_0$ with the formula $N_{i-1} = [\frac{N_i}{q_i}]$.

• Compute $O_i = (N_i - q_i N_{i-1})/p_i$.

• Compute $u = O_2 - O_1$.

The same proof as for proposition 6 shows:

Proposition 17. The initial message $m$ verifies $Xm = N$, $\epsilon m = O$, $\mu m = u$.

Now, since $\mu$ is a superincreasing sequence, the map $m \mapsto \mu m$ is injective and the formula to decypher
is as in proposition 14.

4 Security results 

In this section, we analyse the security of the second cryptographic system (section 2). We concentrate
our attention on this system because it is the easiest system to attack: the key is short and no special
effort has been made to hide the superincreasing sequence.

We recall the notations. The private key is $q, \epsilon_1, \dots, \epsilon_s, x_0, \sigma$ where $x_0 = (v_1, \dots, v_s)$, $\epsilon_{\sigma(i)}$ is a
superincreasing sequence and $\sum_{i=1}^{s} \epsilon_i < q$. The public key is $x_1 = (w_1, \dots, w_s)$ where $w_i = q v_i + \epsilon_i$.

Obviously, $\epsilon_i = w_i - q[\frac{w_i}{q}]$, and $\sigma$ is determined by $\epsilon$. In other words, the whole private key is
determined by $q$. We thus call $q$ the private key.

4.1 Unicity of the pseudo-key 

It is not necessary to find the private key $q$ to cryptanalyse. Any number $q'$ with the same properties 
as $q$ would do the job. We call such a number a pseudo-key. Explicitly, in our context, a pseudo-key 
is an integer $q'$ such that the numbers $v'_i, r_i$ defined by the euclidean divisions $w_i = q'v'_i + r_i$ verify 
$\sum_{i=1}^s r_i < q'$ and $(r_i)$ is a superincreasing sequence up to permutation. 
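
The definition of a pseudo-key translates directly into a test, and for toy parameters the pseudo-keys can be enumerated by brute force. The sketch below is an illustration only: it assumes the binary convention for "superincreasing" (each term is strictly larger than the sum of the smaller ones) and restricts the search to $q' \le \min(w_i)$ so that every quotient $v'_i$ is at least 1. Both choices, and the function names, are assumptions made for the example.

```python
def is_superincreasing_up_to_permutation(r):
    """True if the sorted values are positive and each exceeds the sum of the smaller ones."""
    total = 0
    for x in sorted(r):
        if x <= total:
            return False
        total += x
    return True

def is_pseudo_key(w, q_prime):
    """Check the two conditions of the definition for the candidate q'."""
    rests = [wi % q_prime for wi in w]
    return sum(rests) < q_prime and is_superincreasing_up_to_permutation(rests)

def pseudo_keys(w):
    """Brute-force enumeration, only feasible for toy key sizes."""
    return [qp for qp in range(2, min(w) + 1) if is_pseudo_key(w, qp)]

# Toy public key w_i = q v_i + eps_i with q = 1009, v = (5, 7, 11, 13), eps = (1, 2, 4, 8).
w = [1009 * v + e for v, e in zip([5, 7, 11, 13], [1, 2, 4, 8])]
```

Here `is_pseudo_key(w, 1009)` holds, since the rests are exactly $(1, 2, 4, 8)$. Whether other values appear in `pseudo_keys(w)` depends on the instance, which is the question studied in this section.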

If there are many pseudo-keys, it is easier to attack the system. For instance, in the Merkle-Hellman 
modular knapsack cryptanalysed by Shamir-Adleman, there were many pseudo-keys. The strategy of 
Shamir was to find a pseudo-key. 

The experiments made on our cryptosystem show that usually the pseudo-key is unique. We chose 
random instances of the parameters and counted the percentage of cases where the pseudo-key is unique. 
Those results suggest that when $s > 200$, which are the cases considered in practice, the pseudo-key 
should be unique and equal to the private key with high probability. 



• Let $N = \begin{pmatrix} N_0 \\ N_1 \\ N_2 \end{pmatrix} \in M_{3\times 1}(\mathbb{N})$ and $X$ 

Proposition 18. Consider the second cryptosystem, variant 2. The results of the experiments are as 
follows. 

• s = 5, 20 < p < 35, the pseudo-key is unique in 2 % of the cases. 

• s = 6, 30 < p < 45, the pseudo-key is unique in 46 % of the cases. 

• s = 7, 30 < p < 45, the pseudo-key is unique in 79 % of the cases. 

• s = 8, 40 < p < 55, the pseudo-key is unique in 96 % of the cases. 

Besides this computation, we want to explain why we expect a unique pseudo-key when s is large 
enough. 

For a fixed $q'$, the rests $r_i = w_i \bmod q'$ are numbers between $0$ and $q' - 1$. In the absence of relation 
between $w_i$ and $q'$, these rests are expected to follow a uniform law of repartition in $\{0,\dots,q'-1\}$. 
Of course the exact law of $r_i = w_i \bmod q'$ depends on the law of $w_i$ (hence on the law of $q, v_i, \varepsilon_i$ as 
$w_i = qv_i + \varepsilon_i$) and on the choice of $q'$, but a uniform law is an approximation for the law of $r_i$. 

If one accepts this approximation, the next proposition estimates the probability of finding a 
$q$ such that the sum of the rests is bounded by $q$, as required for a pseudo-key. 

Proposition 19. Let $q > 2$. Consider the rests $r_1(q),\dots,r_s(q)$ where $r_i(q) = w_i \bmod q$. Suppose that 
$r_1(q),\dots,r_s(q)$ follow independent uniform laws with values in $\{0,\dots,q-1\}$. The probability $P$ that 
$\sum_{i=1}^s r_i(q) \le q - 1$ satisfies $P \le \left(\frac{2}{3}\right)^{s-1}$. 

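Lemma 20. Let $a_1 \ge a_2 \ge \dots \ge a_n$ and $p_1 \le p_2 \le \dots \le p_n$. Then $n\sum_{i=1}^n a_ip_i \le \left(\sum_{i=1}^n a_i\right)\left(\sum_{i=1}^n p_i\right)$. 

Proof of the lemma. $\left(\sum_{i=1}^n a_i\right)\left(\sum_{i=1}^n p_i\right) - n\sum_{i=1}^n a_ip_i = \sum_{i=1}^n a_ip_i + \sum_{i=1}^n a_i\sum_{k=1,k\ne i}^n p_k - \sum_{i=1}^n a_ip_i - (n-1)\sum_{i=1}^n a_ip_i$ 
$= \sum_{i=1}^n\sum_{k=1,k\ne i}^n a_i(p_k - p_i) = \sum_{1\le i<k\le n}(a_i - a_k)(p_k - p_i) \ge 0$. ■ 

Proof of proposition [19]. We have $P(r_i(q) = k) = \frac{1}{q}$ for every $k \in \{0,\dots,q-1\}$. For $0 \le r \le q-1$, denote 
by $P_{q,s,r}$ the probability that $\sum_{i=1}^s r_i(q) = r$. We show by induction on $s \ge 1$ that $P_{q,s,0} \le P_{q,s,1} \le \dots \le P_{q,s,q-1}$ 
and that $\sum_{r=0}^{q-1} P_{q,s,r} \le \left(\frac{2}{3}\right)^{s-1}$. This is obvious for $s = 1$. Note that $P_{q,s,r} = \sum_{k=0}^{r}\frac{P_{q,s-1,k}}{q}$. 
In particular, $\sum_{r=0}^{q-1} P_{q,s,r} = \frac{qP_{q,s-1,0} + (q-1)P_{q,s-1,1} + \dots + P_{q,s-1,q-1}}{q} \le \frac{q+1}{2q}\left(P_{q,s-1,0} + \dots + P_{q,s-1,q-1}\right)$ by the 
lemma. Now the induction implies that the right hand side of the inequality is bounded by $\frac{q+1}{2q}\left(\frac{2}{3}\right)^{s-2} \le \left(\frac{2}{3}\right)^{s-1}$ for $q > 2$. ■ 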
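
The bound of proposition 19 can be checked exactly for small values of $q$ and $s$. The following sketch uses the recursion $P_{q,s,r} = \sum_{k \le r} P_{q,s-1,k}/q$ from the proof, with exact rational arithmetic. The function name is hypothetical and the check is a sanity test, not part of the argument.

```python
from fractions import Fraction

def prob_sum_below_q(q, s):
    """Exact probability that s independent uniform rests in {0..q-1} have sum <= q-1."""
    # dist[r] = P(r_1 + ... + r_j = r) for r in 0..q-1; larger sums are never needed
    dist = [Fraction(1, q)] * q
    for _ in range(s - 1):
        prefix, new_dist = Fraction(0), []
        for r in range(q):
            prefix += dist[r]            # sum of P_{q,j,k} for k <= r
            new_dist.append(prefix / q)  # P_{q,j+1,r}
        dist = new_dist
    return sum(dist)
```

For $q = 3$, $s = 2$ the function returns $2/3$, so the bound is attained there, while for $q = 2$, $s = 2$ it returns $3/4$, which shows why the hypothesis $q > 2$ is needed.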



Proposition 21. Let $s \in \mathbb{N}$ be a fixed number and $t \gg s$. Let $S_{st}$ be the number of superincreasing 
sequences $r_1,\dots,r_s$ with sum $t$ and $C_{st}$ the number of sequences with sum $t$. Then $\frac{S_{st}}{C_{st}}$ is asymptotically 
equal to $\frac{1}{2^{\frac{s(s-1)}{2}}}$ when $t$ tends to infinity. 

Proof. The number of sequences $r_1,\dots,r_s$ with sum $t$ is $\binom{t+s-1}{s-1}$ and is equivalent to $\frac{t^{s-1}}{(s-1)!}$. Remark 
that $S_{st} = \sum_{i=1}^{\left[\frac{t-1}{2}\right]} S_{s-1,i}$. By induction on $s$, $S_{st}$ is equivalent to $\frac{t^{s-1}}{(s-1)!\,2^{\frac{s(s-1)}{2}}}$. ■ 
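
The recursion used in the proof gives a direct way to count superincreasing sequences and to compare the count with the asymptotic equivalent. The sketch below assumes that the terms are positive integers and that the last term must be strictly larger than the sum $i$ of the previous ones, ie. $i \le \left[\frac{t-1}{2}\right]$. The function names are hypothetical.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def count_superincreasing(s, t):
    """Number of superincreasing sequences of s positive integers with sum t."""
    if s == 1:
        return 1 if t >= 1 else 0
    # the first s-1 terms have sum i, and the last term t - i must exceed i
    return sum(count_superincreasing(s - 1, i) for i in range(1, (t - 1) // 2 + 1))

def asymptotic_count(s, t):
    """Equivalent t^(s-1) / ((s-1)! 2^(s(s-1)/2)) from the proof."""
    return t ** (s - 1) / (factorial(s - 1) * 2 ** (s * (s - 1) // 2))
```

For $s = 3$ and $t = 2001$ the exact count and the equivalent differ by less than one percent, in agreement with the proposition.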

Summing up the situation, a number $q$ is a pseudo-key if the sum of the rests $r_i(q)$ is less than $q$ and 
if these rests form a superincreasing sequence. By proposition [19], the probability for the first condition 
is less than $\left(\frac{2}{3}\right)^{s-1}$. And by proposition [21], the probability that the second condition is satisfied is 
around $\frac{1}{2^{\frac{s(s-1)}{2}}}$. 

In particular we expect a unique pseudo-key $q$ when the number of possible values for $q$ is asymptotically 
dominated by $\left(\frac{3}{2}\right)^{s-1}2^{\frac{s(s-1)}{2}}$. This is the case for the second system we have constructed with 
the suggested choices of parameters and this gives an explanation to the results of proposition [18]. 

This is only a heuristic argument (there could be obvious pseudo-keys associated to the private key 
$q$, for instance $q - 1$, $q + 1$ or $2q$). However, the general picture is that the unicity of the pseudo-key 
verified empirically in proposition [18] should be easy to reproduce with other families and other choices 
of parameters. 

4.2 Finding a pseudo-key is as difficult as factorising an integer 

In this section, we show that the problem of finding the exact value of the private key $q$ is as difficult as 
factorizing an integer $n$, product of two primes. More precisely, we show that an easier problem (finding 
a pseudo-key with the help of some extra information) is as difficult as the factorisation of $n$, in the 
sense of a probabilistic reduction. 

There are several problems, depending on whether one wants to compute one key or all keys, and 
depending on the information given as input. 

• Input of problem 1: the public key $w_1,\dots,w_s$. Problem 1: compute all the pseudo-keys $q$. 

• Input of problem 2: the public key $w_1,\dots,w_s$. Problem 2: compute one pseudo-key $q$. 

• Input of problem 3: the public key $w_1,\dots,w_s$ and integers $r_1 < \dots < r_{s-1}$, a range $[a,b]$. Problem 
3: compute all pseudo-keys $q$ such that the rests of the divisions $w_i = qv_i + \varepsilon_i$ satisfy $\varepsilon_i = r_i$ for 
$0 < i < s$ and $\varepsilon_s \in [a,b]$. 

• Input of problem 4: the public key $w_1,\dots,w_s$ and integers $r_1 < \dots < r_{s-1}$, a range $[a,b]$. Problem 
4: compute one pseudo-key $q$ such that the rests of the divisions $w_i = qv_i + \varepsilon_i$ satisfy $\varepsilon_i = r_i$ for 
$0 < i < s$ and $\varepsilon_s \in [a,b]$. 

Obviously, it is more difficult to find all the keys than to find one key, and the problem is easier when more 
information is given as input, as long as the definition of "more difficult" is sensible (polynomial time 
reduction, probabilistic polynomial time reduction ...). In particular, if $\ge$ stands for "more difficult" 
then problem 1 $\ge$ problem 2, and problem 1 $\ge$ problem 3 $\ge$ problem 4 in the above list. There is 
no proven relation between problem 2 and problem 4. However, when the pseudo-key is unique, then 
problem 1 = problem 2 and the easiest problem in the list is Problem 4. The previous section explained 
why the pseudo-key is unique for many cryptosystems. Thus the security of the system relies on the 
difficulty of solving Problem 4. We show that solving Problem 4 is as difficult as factorising a product of 
two primes. 

• Input of problem 5: an integer n which is a product of two primes. Problem 5: Find the factors 
p, q of n. 

Theorem 22. If it is possible to solve Problem 4 in polynomial time (with respect to the length of 
the input data), then $\forall \eta > 0$, it is possible to solve Problem 5 in polynomial time with a probability of 
success at least $1 - \eta$. 

Proof. Let $n$ be an integer. We make a polynomial time probabilistic reduction to Problem 4 to get 
the factorisation of $n = pq$. 

Choose any superincreasing sequence $0 < r_1 < \dots < r_{s-1}$. First, try to divide $n$ by all elements $q$ 
with $1 < q \le 3\sum_{i=1}^{s-1} r_i$. If this doesn't succeed, then all the nontrivial divisors $q$ of $n$ satisfy $q > 3\sum_{i=1}^{s-1} r_i$. 

Let $w_i = n + r_i$ for $1 \le i \le s-1$. Let $r$ be an integer such that $\left(\frac{2}{3}\right)^r < \eta$. Let $w_{s1},\dots,w_{sr}$ be 
integers chosen randomly in the range $]\frac{n}{2}, n[$. With these $r$ numbers, we consider $r$ problems $P_1,\dots,P_r$. 
The problem $P_k$ is Problem 4 with input $w_1,\dots,w_{s-1}, w_{sk}, r_1,\dots,r_{s-1}$, $a = 0$, $b = \left[\frac{n}{2}\right]$. 

Let $q$ be a proper divisor of $n = pq$. It satisfies $q > 3\sum_{i=1}^{s-1} r_i$. Thus, for each $k$, there is a probability 
$x > \frac{1}{3}$ that $w_{sk} \bmod q$ satisfies $\sum_{i=1}^{s-1} r_i < w_{sk} \bmod q < q$. Remark that $(1-x)^r < \left(\frac{2}{3}\right)^r < \eta$. Then, 
with probability at least $(1 - \eta)$, among the $r$ random choices $w_{s1},\dots,w_{sr}$ for $w_s$, one of them $w_{sk}$ 
satisfies $\sum_{i=1}^{s-1} r_i < w_{sk} \bmod q < q$. We denote by (*) this condition. To conclude, it suffices to show 
that one can find a factorisation of $n$ in polynomial time when (*) is satisfied. 

We thus suppose that one problem $P_k$ in the list $P_1,\dots,P_r$ satisfies the condition (*). Since $r_i < q$, 
the equality $w_i = qp + r_i$ is the euclidean division of $w_i$ by $q$ when $0 < i < s$. Since the rest $e_{sk}$ of the 
division $w_{sk} = q\left[\frac{w_{sk}}{q}\right] + e_{sk}$ satisfies $e_{sk} > \sum_{i=1}^{s-1} r_i$ and $e_{sk} < q \le \frac{n}{2}$, it follows that a proper divisor $q$ 
of $n$ is a solution to problem $P_k$. 

Conversely, a solution $q$ of $P_k$ is a divisor of $n$ different from 1 since $w_i \bmod q = r_i$. This divisor 
of $n$ is not $n$ since the condition $e_{sk} \in [a,b]$ is not satisfied for $q = n$. Thus a polynomial time algorithm 
that solves Problem 4 returns a strict divisor $q$ of $n$ when applied to $P_k$. Hence the factorisation of $n$ 
in polynomial time. 

A priori, we don't know which problem $P_k$ satisfies (*) in the list $P_1,\dots,P_r$. We thus run a 
multi-threaded algorithm which tries to solve in parallel the problems $P_1,\dots,P_r$ and which stops as soon as 
it finds a solution for one problem. ■ 
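
The reduction in the proof can be illustrated on a toy number. In the sketch below, `toy_problem4_oracle` is a brute-force stand-in for the hypothetical polynomial time solver of Problem 4 (it is exponential and only there to make the example runnable), the problems $P_k$ are tried one after the other instead of in parallel, and "superincreasing" is read in the binary sense. All function names are assumptions made for the example.

```python
import random

def is_superincreasing(seq):
    total = 0
    for x in sorted(seq):
        if x <= total:
            return False
        total += x
    return True

def toy_problem4_oracle(w, r, a, b):
    """Brute-force stand-in for a Problem 4 solver: returns a pseudo-key or None."""
    s = len(w)
    for q in range(2, max(w) + 1):
        rests = [wi % q for wi in w]
        if (rests[:s - 1] == list(r) and a <= rests[-1] <= b
                and sum(rests) < q and is_superincreasing(rests)):
            return q
    return None

def factor_with_oracle(n, r, eta, oracle, rng):
    """Return a proper divisor of n, following the steps of the proof."""
    for q in range(2, 3 * sum(r) + 1):   # trial division up to 3 * sum(r_i)
        if n % q == 0:
            return q
    trials = 1
    while (2 / 3) ** trials >= eta:      # number r of problems with (2/3)^r < eta
        trials += 1
    w_head = [n + ri for ri in r]        # w_i = n + r_i
    for _ in range(trials):
        w_s = rng.randrange(n // 2 + 1, n)   # random w_sk in ]n/2, n[
        q = oracle(w_head + [w_s], r, 0, n // 2)
        if q is not None:
            return q
    return None
```

With $n = 101 \times 103$, the sequence $(1, 2, 4)$ and $\eta = 0.001$, a call such as `factor_with_oracle(10403, [1, 2, 4], 0.001, toy_problem4_oracle, random.Random(0))` is expected to return one of the two prime factors, since any value accepted by the oracle divides $n$ and differs from $1$ and $n$.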



4.3 Comparing LLL attacks on $x_0$ and $x_1$ 

The previous sections have explored the security of the key. It remains to analyse the security of the 
system with respect to heuristic attacks. As most heuristic attacks of knapsack cryptosystems rely on 
variants of the LLL algorithm, we analyse the security of the system for LLL-based heuristic attacks. 

The knapsack problem is NP-complete and experiments show that the heuristic attacks fail when 
the encryption is done with a well chosen general key $x_0$. In our system, the encryption is realised with 
a key $x_1 = qx_0 + \varepsilon$ which is a modification of $x_0$, and it could happen that the key $x_1$ is less secure than 
$x_0$. Thus we look for a security result asserting that the key $x_1$ is as secure as $x_0$ for LLL-attacks. 

The key $x_1$ could be weaker than $x_0$ for two reasons: 

• The heuristic algorithm used to break the system could perform faster for a message encrypted 
with $x_1$ than for a message encrypted with $x_0$. 

• The heuristic could fail for a message encrypted with $x_0$ but could succeed for the same message 
encrypted using $x_1$. 

We fix an algorithm to attack the ciphertexts. To measure the speed of the algorithm, we denote 
by $n(N)$ the number of steps of the algorithm when the attack is run on the ciphertext $N$. To measure 
the probability of success of the algorithm, we introduce the symbol $R(N)$ which is the result of the 
attack ($R(N) = m$ if the attack succeeds and recovers the plain text message $m$, $R(N) = FAILURE$ 
otherwise). As the algorithm depends on a matrix $M$ chosen randomly in the unit ball $B(1)$, the precise 
notations are $n_M(N)$ and $R_M(N)$. 

The two keys $x_0$ and $x_1$ yield two ciphertexts $N_0$ and $N_1$. The following theorem says that the key 
$x_1 = qx_0 + \varepsilon$ is as secure as $x_0$, both for the speed and for the probability of success of the attack. 
Both the number of steps $n$ and the returned message $R$ are unchanged when replacing $x_0$ with $x_1$ 
provided that two conditions are satisfied: the matrix $M$ must live in a dense open subset and $\frac{\|\varepsilon\|}{q}$ 
must be small enough. These two conditions are compatible with the practice: $M$ is chosen randomly 
and falls with high probability in a dense open subset and $\frac{\|\varepsilon\|}{q}$ is small by the very construction of our 
cryptosystem. 

Theorem 23. $\forall m$, $\forall x_0$, there exists a dense open subset $V \subset B(1)$, there exists $\eta > 0$ such that $\forall M \in V$, 
$\forall x_1 = qx_0 + \varepsilon$ with $\frac{\|\varepsilon\|}{q} < \eta$: 

• $n_M(N_0) = n_M(N_1)$ 

• $R_M(N_0) = R_M(N_1)$. 

The key arguments of our proof are as follows: 

• The elements $x_1$ and $x_0$ are close as points of the projective space. 

• The LLL algorithm can be factorized to give an action on the projective level. 

• The number of steps in the algorithm and the result of the algorithm are functions of the input 
which are locally constant on a dense open subset. In particular, replacing $x_0$ with $x_1$ does not 
change the number of steps and the result when $x_0$ and $x_1$ are sufficiently close. 

Though the algorithm required for the attack is fixed, its precise form is not important. The key point 
is that it relies on the LLL algorithm and that the additional data $M$ required to run the algorithm is 
chosen randomly. Similar theorems can be obtained with other heuristics relying on the LLL algorithm. 
Thus, beyond the precise attack considered, our theorem suggests that replacing the public key $x_0$ with 
$x_1$ does not expose our system to LLL-based attacks. 

4.3.1 The LLL-algorithm 

This section shows that the output of the LLL-algorithm depends continuously on the input when the 
input takes value in a dense open subset. 

This is not clear a priori, since the operations performed during the LLL algorithm include non 
continuous functions (integer parts). We introduce a class of algorithms that we call analytic. The 
LLL algorithm is an analytic algorithm. Analytic algorithms can include non continuous functions in 
the process but their output depends continuously (in fact analytically) on the input when the input is 
general enough. 

Recall that the LLL algorithm takes for input a basis $(b_1,\dots,b_n)$ of a lattice $L \subset \mathbb{R}^m$ and computes 
a reduced basis $(c_1,\dots,c_n)$. We refer to [6] for details. 

Definition 24. Consider an algorithm which makes operations on a datum $D \in U$ where $U \subset \mathbb{R}^m$ is an 
open set (each step of the algorithm is a modification of the value of the datum $D$). Suppose that the 
algorithm is defined by a number of states $0, 1, \dots, s$ and for each state $i$ by: 

• a function $f_i : U \to \mathbb{R}$ 

• two functions $T_i^+ : U \to U$ and $T_i^- : U \to U$ 

• two integers $i^+$ and $i^-$ in $\{0,\dots,s\}$. 

The algorithm starts in state 1 with datum $D$ the input of the algorithm. If the algorithm is in state $i$, 
the datum is $D$ and $f_i(D) \ge 0$ (resp. $f_i(D) < 0$), then it goes to state $i^+$ (resp. $i^-$) with the datum 
$T_i^+(D)$ (resp. $T_i^-(D)$). The algorithm terminates in state 0 and returns the value of the datum $D$ when 
it terminates. By convention, we put $0^+ = 0^- = 0$, $T_0^+ = T_0^- = \mathrm{Identity}_U$, $f_0 = 1$. 
The algorithm is called analytic if: 

• the test functions $f_i : U \to \mathbb{R}$ are analytic 

• the transformation functions $T_i^+ : U \to U$ and $T_i^- : U \to U$ are analytic on a dense open subset 
$U_i \subset U$ such that $V_i = U \setminus U_i$ is a closed analytic subset 

• For every $D$ in $U$, the algorithm terminates. 
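
The format of the definition can be made concrete with a small interpreter. The sketch below is an illustration under assumptions: each state is stored as a tuple $(f_i, T_i^+, T_i^-, i^+, i^-)$, the step counter and the `max_steps` guard are additions for the example, and the one-state toy algorithm (subtract 1 while $x \ge 1$) is not taken from the LLL algorithm.

```python
def run(states, datum, max_steps=10_000):
    """Run an algorithm given in the format of the definition; return (output, number of steps)."""
    state, steps = 1, 0
    while state != 0:
        f, t_plus, t_minus, i_plus, i_minus = states[state]
        if f(datum) >= 0:
            datum, state = t_plus(datum), i_plus
        else:
            datum, state = t_minus(datum), i_minus
        steps += 1
        if steps > max_steps:
            raise RuntimeError("no termination within max_steps")
    return datum, steps

# Toy one-state algorithm on U = R: subtract 1 while x >= 1.
SUBTRACT_ONES = {
    1: (lambda x: x - 1,   # test function f_1
        lambda x: x - 1,   # T_1^+
        lambda x: x,       # T_1^-
        1,                 # 1^+ : stay in state 1
        0),                # 1^- : terminate
}
```

On this toy example the output is discontinuous at positive integers: `run(SUBTRACT_ONES, 2.5)` gives `(0.5, 3)` and the number of steps is the same for all inputs strictly between 2 and 3, while `run(SUBTRACT_ONES, 3.0)` gives `(0.0, 4)`. The output and the number of steps are therefore well behaved only away from a closed set of exceptional inputs.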

Proposition 25. The LLL algorithm is analytic. 

Proof. We use the description of the algorithm given in [6], page 119. The datum $D$ handled 
by the algorithm is a basis $(b_1,\dots,b_n)$ of a lattice $L$. It takes values in the open subset $U \subset (\mathbb{R}^m)^n$ 
parametrising the $n$-tuples of linearly independent vectors. All the test functions $f_i$ which appear in 
the algorithm of [6] are analytic (they are even algebraic functions on $U$). All the functions involved in 
the handling of the basis $b_i$ (which correspond to our functions $T_i^+$ and $T_i^-$) are algebraic too, except 
for an integer part $[x]$ which is analytic on the dense open set $x \notin \mathbb{N}$. ■ 

Theorem 26. Let $A : U \to U$ be the output function associated to an analytic algorithm, ie. for $D \in U$, 
the value of $A(D)$ is the output of an analytic algorithm with input $D$. Then there exists a dense open 
subset $V \subset U$ such that 

• $A : V \to U$ is analytic 

• the number of steps to compute the output $A(D)$ is locally constant for $D \in V$. 

Proof. We keep the notations of definition [24]. In particular, the algorithm starts in state 1 and ends 
in state 0. A sign function $\epsilon$ of length $\mathrm{length}(\epsilon) = k$ is by definition a function $\epsilon : \{1,\dots,k\} \to \{+,-\}$. 
We associate to any sign function of length $k$ a finite sequence $n_0(\epsilon),\dots,n_k(\epsilon)$ constructed with the 
integers $i^+$ and $i^-$ of the analytic algorithm. Explicitly $n_0(\epsilon) = 1$, $n_1(\epsilon) = n_0(\epsilon)^{\epsilon(1)}$, ..., $n_k(\epsilon) = 
n_{k-1}(\epsilon)^{\epsilon(k)}$. We use below the notation $n_i$ instead of $n_i(\epsilon)$ to shorten the notation. Let $A_\epsilon : U \to U$, 
$A_\epsilon = T_{n_{k-1}}^{\epsilon(k)} \circ \dots \circ T_{n_0}^{\epsilon(1)}$. Let $g_\epsilon : U \to \mathbb{R}$, $g_\epsilon = f_{n_k} \circ A_\epsilon$. We define by induction on $k = \mathrm{length}(\epsilon)$ 
a set $W_\epsilon$ such that 

• $W_\epsilon \subset U$ is an open inclusion 

• $A_\epsilon : W_\epsilon \to U$ is analytic. 

• $D \in W_\epsilon \Rightarrow$ the successive states $s_0,\dots,s_k$ of the algorithm $A$ applied with input $D$ are $s_0 = 
n_0(\epsilon) = 1$, $s_1 = n_1(\epsilon)$, ..., $s_k = n_k(\epsilon)$. Moreover, the value of the datum after the algorithm arrives 
in state $n_k(\epsilon)$ is $A_\epsilon(D)$. 

• $\bigcup_{\epsilon,\ \mathrm{length}(\epsilon)=k} W_\epsilon$ is dense in $U$. 

We start the induction with $k = 0$, using the convention that there is a unique function $\epsilon$ defined on a set 
with $k = 0$ element and that $A_\epsilon = Id$. Then $W_\epsilon = U$ obviously satisfies the list of required conditions. 

Let now $k > 0$. Let $\tau : \{1,\dots,k-1\} \to \{+,-\}$ be the restriction of $\epsilon$ to $\{1,\dots,k-1\}$. 

Let $W_{\tau+} = W_\tau \cap \{D \in U, g_\tau(D) > 0\} \cap (A_\tau)^{-1}(U_{n_{k-1}})$ where $U_{n_{k-1}}$ is the open subset of $U$ where 
$T_{n_{k-1}}^+$ and $T_{n_{k-1}}^-$ are analytic. Similarly, let $W_{\tau-} = W_\tau \cap \{D \in U, g_\tau(D) < 0\} \cap (A_\tau)^{-1}(U_{n_{k-1}})$. The 
disjoint union $W_{\tau+} \sqcup W_{\tau-}$ is dense in $W_\tau$ since the difference is included in the closed analytic subset 
$(g_\tau = 0) \cup A_\tau^{-1}(U - U_{n_{k-1}})$. 

Let $W_\epsilon = W_{\tau+}$ if $\epsilon(k) = +$ and $W_\epsilon = W_{\tau-}$ if $\epsilon(k) = -$. Since $W_{\tau+} \cup W_{\tau-}$ is dense in $W_\tau$ and since 
$\bigcup_{\mathrm{length}(\tau)=k-1} W_\tau$ is dense in $U$ by induction, we obtain the density of $\bigcup_{\mathrm{length}(\epsilon)=k} W_\epsilon$ in $U$. 

The other claims of the list are satisfied by construction. 

Let $W_k = \bigcup_{\epsilon \text{ of length } k} W_\epsilon$. The intersection $V = \bigcap_{k \ge 0} W_k$ is equal to the disjoint union 

$$\coprod_{k,\ \epsilon,\ \mathrm{length}(\epsilon)=k,\ n_k = 0,\ n_{k-1} \ne 0} W_\epsilon.$$ 

The set $V$ is open as a union of open sets, and it is dense in $U$ by Baire's theorem. On each open subset 
$W_\epsilon$ appearing in the disjoint union, the algorithm applied to $D$ returns $A_\epsilon(D)$ which is analytic and the 
number of steps of the algorithm is $\mathrm{length}(\epsilon)$, thus it is constant on each open set of the disjoint union. ■ 



Proposition 27. Let $b_1,\dots,b_n$ be a basis of a lattice $L \subset \mathbb{R}^m$, $m \ge n$. Let $(c_1,\dots,c_n) = 
LLL(b_1,\dots,b_n)$ be the reduced basis computed by the LLL algorithm. There exists a dense open subset 
$U \subset (\mathbb{R}^m)^n$ such that 

• $U \to (\mathbb{R}^m)^n$, $(b_i) \mapsto (c_i)$ is continuous. 

• $U \to \mathbb{N}$, $(b_i) \mapsto$ number of steps of the LLL-algorithm is locally constant. 

Proof. Follows from proposition [25] and theorem [26]. ■ 

Corollary 28. Let $\psi : U \to SL_n(\mathbb{Z})$, $(b_1,\dots,b_n) \mapsto M$ such that $\begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = M \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}$. Then $\psi$ is locally 
constant. 

Proof. The map is continuous with values in a discrete set. ■ 

4.3.2 The heuristic attack 

Let $w_1,\dots,w_s \in \mathbb{N}$ be a public key. Let $m \in \{0,1\}^s$ be a plaintext message and $N = \sum_{i=1}^s m_iw_i$ be the 
associated ciphertext. The following attack is well known. 
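Heuristic attack 1. 

• Choose $\lambda = 2^{-2s}\min(w_i)$. 

• Apply the LLL algorithm to the lattice generated by the rows $b_i$ of the matrix 

$$D = \begin{pmatrix} \lambda & 0 & \cdots & 0 & w_1 \\ 0 & \lambda & \cdots & 0 & w_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & \lambda & w_s \\ 0 & 0 & \cdots & 0 & N \end{pmatrix}$$ 

Any vector $c_i$ of the reduced basis is a linear combination: $c_i = \sum_{j=1}^{s+1} r_{ij}b_j$. 

• For each vector $c_i$ of the reduced basis, check if the set $r_{ij}$, $j \le s$ (or $-r_{ij}$) is equal to $m$ (ie. check 
if $r_{ij} = 0$ or $1$, and if $\sum_{j=1}^s r_{ij}w_j = N$). 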

In the above attack, the precise value of the coefficients of the matrix $D$ is not important. The 
precise shape of $D$ has been chosen to speed up the computations and simplify the presentation, but is 
not required by theoretical considerations. The attack could start with any invertible matrix whose $s$ 
first columns contain small numbers and whose last column is close to the last column of $D$. Thus the 
following attack is more general and natural. 
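Heuristic attack 2. 

• Choose $\lambda = 2^{-2s}\min(w_i)$. 

• Choose coefficients $m_{ij}$, $i, j \le s+1$ with $|m_{ij}| < 1$. Let $M = (m_{ij})$ be the corresponding matrix. 

• Let $X = \begin{pmatrix} 0 & \cdots & 0 & w_1 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & w_s \\ 0 & \cdots & 0 & N \end{pmatrix}$. 

• Apply the LLL algorithm to the lattice generated by the rows $b_i$ of the matrix 

$$D = X + \lambda M = \begin{pmatrix} \lambda m_{11} & \cdots & \lambda m_{1s} & w_1 + \lambda m_{1,s+1} \\ \lambda m_{21} & \cdots & \lambda m_{2s} & w_2 + \lambda m_{2,s+1} \\ \vdots & & \vdots & \vdots \\ \lambda m_{s1} & \cdots & \lambda m_{ss} & w_s + \lambda m_{s,s+1} \\ \lambda m_{s+1,1} & \cdots & \lambda m_{s+1,s} & N + \lambda m_{s+1,s+1} \end{pmatrix}$$ 

Any vector $c_i$ of the reduced basis is a linear combination: $c_i = \sum_{j=1}^{s+1} r_{ij}b_j$ and the coefficients $r_{ij}$ 
can be computed during the LLL algorithm. 

• For each vector of the reduced basis, check if the set $r_{ij}$, $j \le s$ or $-r_{ij}$, $j \le s$ is equal to $m$. 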
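
The reason why these attacks may succeed is that the lattice contains a short vector encoding the message. The following sketch builds the matrix $D$ of heuristic attack 1 with exact rationals and checks that the combination $\sum_j m_jb_j - b_{s+1}$ equals $(\lambda m_1,\dots,\lambda m_s, 0)$. It does not run LLL itself; the function names and the toy key are assumptions made for the example.

```python
from fractions import Fraction

def attack1_matrix(w, N):
    """Rows b_1..b_{s+1} of D for heuristic attack 1, with lambda = 2^(-2s) min(w_i)."""
    s = len(w)
    lam = Fraction(min(w), 2 ** (2 * s))
    rows = [[lam if j == i else Fraction(0) for j in range(s)] + [Fraction(w[i])]
            for i in range(s)]
    rows.append([Fraction(0)] * s + [Fraction(N)])
    return lam, rows

def combine(coeffs, rows):
    """Integer linear combination sum_j coeffs[j] * rows[j]."""
    width = len(rows[0])
    return [sum(c * row[j] for c, row in zip(coeffs, rows)) for j in range(width)]

# Toy data: public key w, message m and ciphertext N = sum m_i w_i.
w = [5046, 7065, 11103, 13125]
m = [1, 0, 1, 1]
N = sum(mi * wi for mi, wi in zip(m, w))
lam, D = attack1_matrix(w, N)
short_vector = combine(m + [-1], D)
```

Here `short_vector` has last coordinate $0$ and its other coordinates are $\lambda m_j$, so its norm is at most $\lambda\sqrt{s}$. In heuristic attack 2 the same combination is perturbed by the terms coming from $\lambda M$.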



4.3.3 Proof of the theorem 

Consider a plain text message $m$. It can be encrypted with the generic key $x_0 = (v_1,\dots,v_s)$ or with the 
key $x_1 = qx_0 + \varepsilon = (w_1,\dots,w_s)$. The two ciphertexts associated with the keys $x_0$ and $x_1$ are denoted 
by $N_0$ and $N_1$. 

We compare below how these two encryptions resist "Heuristic attack 2" presented above. For 
this algorithm, we need a random matrix $M$ in the unit ball $B(1)$. Recall that we called $n_M(N)$ the 
number of steps of the algorithm when the attack is done on the ciphertext $N$. Similarly, we defined 
$R_M(N)$ to be the result of the attack ($R_M(N) = m$ if the attack recovers the plain text message $m$ and 
$R_M(N) = FAILURE$ otherwise). 

Theorem 29. $\forall m$, $\forall x_0$, there exists a dense open subset $V \subset B(1)$, there exists $\eta > 0$ such that $\forall M \in V$, 
$\forall x_1 = qx_0 + \varepsilon$ with $\frac{\|\varepsilon\|}{q} < \eta$: 

• $n_M(N_0) = n_M(N_1)$ 

• $R_M(N_0) = R_M(N_1)$. 

Proof. We keep the notations $X$, $\lambda$, $D = X + \lambda M$ introduced in the description of the attack. These 
data depend on the public key $x = (w_i)$. We denote by $X_0, \lambda_0, D_0$ and $X_1, \lambda_1, D_1$ these data for the 
keys $x_0$ and $x_1$. 

If $C(\varepsilon, q)$ is the matrix defined by $X_1 = q(X_0 + C(\varepsilon, q))$, then $C(\varepsilon, q) \to 0$ when $\frac{\|\varepsilon\|}{q} \to 0$. 
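
The matrix $C(\varepsilon, q)$ can be written out from the definitions of $X_0$ and $X_1$. The display below is a sketch which assumes $m \in \{0,1\}^s$, $N_0 = \sum_i m_iv_i$, $N_1 = \sum_i m_iw_i$, and which reads $\|\varepsilon\|$ as the sum of the entries of $\varepsilon$ (the norm is not specified in this section, so this reading is an assumption).

```latex
\[
X_1 - qX_0 =
\begin{pmatrix}
0 & \cdots & 0 & w_1 - qv_1 \\
\vdots & & \vdots & \vdots \\
0 & \cdots & 0 & w_s - qv_s \\
0 & \cdots & 0 & N_1 - qN_0
\end{pmatrix}
=
\begin{pmatrix}
0 & \cdots & 0 & \varepsilon_1 \\
\vdots & & \vdots & \vdots \\
0 & \cdots & 0 & \varepsilon_s \\
0 & \cdots & 0 & \sum_{i=1}^s m_i\varepsilon_i
\end{pmatrix},
\qquad
C(\varepsilon, q) = \frac{1}{q}\,(X_1 - qX_0).
\]
```

Under these assumptions every entry of $C(\varepsilon, q)$ is bounded in absolute value by $\frac{\|\varepsilon\|}{q}$, which gives the stated limit.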

If $M$ is a matrix with lines $b_1,\dots,b_s$, and if $(c_1,\dots,c_s) = LLL(b_1,\dots,b_s)$ is the reduced basis 
computed by the LLL-algorithm, we adopt a matrix notation and we denote by $LLL(M)$ the matrix 
with lines $c_1,\dots,c_s$. We denote by $\psi(M)$ the matrix that gives the base change, ie. $LLL(M) = \psi(M).M$. 
Finally, we denote by $n(M)$ the number of steps to perform the LLL-algorithm on the lines of $M$. 

According to proposition [27] and corollary [28], there exists a dense open subset $U$ where $LLL$ is 
continuous and where $n$ and $\psi$ are locally constant. 

Let $V = \frac{U - X_0}{\lambda_0} \cap B(1)$. Thus $V$ is a dense open subset in $B(1)$ where the map $\psi_0 : M \mapsto \psi(D_0(M))$ 
is continuous. Moreover, the number of steps of the algorithm which computes $\psi_0$ is locally constant on 
$V$. 

The analysis of the LLL algorithm given in [6] shows that it is a "projective algorithm", ie. in symbols: 
if $p \in \mathbb{R}$, we have $LLL(pM) = pLLL(M)$, $\psi(pM) = \psi(M)$ and $n(pM) = n(M)$. 

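By definition of the attack considered, the result $R_M(N_i)$ of the attack is a function of the coefficients 
$r_{ij}$ which appear in the matrix $\psi(D_i(M))$. In particular, if $\psi(D_0(M)) = \psi(D_1(M))$, then $R_M(N_0) = 
R_M(N_1)$. 

$\psi(D_1(M)) = \psi\left(q(X_0 + C(\varepsilon,q)) + \lambda_1M\right) = \psi\left(X_0 + C(\varepsilon,q) + \frac{\lambda_1M}{q}\right) = \psi\left(X_0 + \lambda_0\left(\frac{\lambda_1}{q\lambda_0}M + \frac{C(\varepsilon,q)}{\lambda_0}\right)\right) = 
\psi_0\left(\frac{\lambda_1}{q\lambda_0}M + \frac{C(\varepsilon,q)}{\lambda_0}\right)$. When $\frac{\|\varepsilon\|}{q} \to 0$, the argument of $\psi_0$ tends to $M$. Since $M$ is in the open set of 
continuity of $\psi_0$, and since $\psi_0$ is locally constant, $\psi_0\left(\frac{\lambda_1}{q\lambda_0}M + \frac{C(\varepsilon,q)}{\lambda_0}\right) = \psi_0(M) = \psi(D_0(M))$ if $\frac{\|\varepsilon\|}{q}$ is 
small enough. 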

Since $n$ is locally constant too, one can do a similar reasoning with $n$ instead of $\psi$ to show that 
$n_M(N_0) = n(D_0(M)) = n(D_1(M)) = n_M(N_1)$. ■ 